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Abstract 

Practical modifications of deterministic multigrid and conventional relaxation algo- 
rithms are discussed. New parameters need not be tuned but are determined by the 
algorithms themselves. One modification can be thought of as "updating on a last layer 
consisting of a single site" . It eliminates critical slowing down in computations of bosonic 
and fermionic propagators in a fixed volume. Here critical slowing down means diver- 
gence of asymptotic relaxation times as the propagators approach criticality. A remaining 
volume dependence is weak enough in case of bosons so that conjugate gradient can be 
outperformed. However, no answer can be given yet if the same is true for staggered 
fermions on lattices of realizable sizes. Numerical results are presented for propagators of 
bosons and of staggered fermions in 4-dimensional SU (2) gauge fields. 
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1. Introduction 

In Monte Carlo simulations of lattice gauge theories with fermions the most time-consuming 
part is the computation of the gauge field dependent fermion propagators. Conjugate gradient 
(CG) or minimal residual (MR) algorithms are state of the art [p]]. Great hopes to do better are 
attached to multigrid (MG) methods [2-10]. In Ref. JT(| the first MG computations without 
critical slowing down (CSD) in non-Abelian gauge fields (4-d SU(2)) were presented. They 
prove that the MG method can cope with the frustration which is inherent in non-Abelian gauge 
fields. However, elimination of CSD succeeded only when an "optimal" interpolation kernel 
PH| , |TT| was used. The use of this optimal kernel for production runs is impractical because of 
computational complexity and storage space requirements. 

In this letter practical modifications of MG and conventional relaxation algorithms for 
propagators are discussed. A propagator (f> is the solution of a linear equation 

D<P = f (1) 

on a d dimensional lattice A of sites z, for given /. In our case, D = —A + m 2 for bosons, and 
D = —]/) 2 + m 2 for fermions, where A and Ld are the gauge covariant Laplace or Dirac operators 
(with periodic boundary conditions) respectively. Color indices are always suppressed, and 4>(z) 
is an N c x N c matrix where is the number of colors. 

CSD in computations in a fixed volume can be eliminated by "updating on a last layer 
consisting of a single site". Given an approximation to 0, this updating amounts to 
rescaling (j)^ 1 ' by an A(. x iV c matrix ft (in case of bosons or Wilson fermions) : 

0<»>(*)-0W(z)n , n = (^\D^y 1 (^,f) , (2) 

where 

fa*) = mE^)%) • (3) 

l A l ZgA 

We will discuss Eq. (|j) and its generalization for staggered fermions, and related modifi- 
cations in Sec. 2. Numerical results for propagators of bosons and of staggered fermions in 
4-dimensional SU(2) gauge fields will be presented in Sec. 3. 

2. Improving algorithms from a variational point of view 
2.1. Updating on a l d MG layer 

It is well known that the CSD of conventional iterative algorithms for solving Eq. (fl]) 
depends only on m 2 and not on the lattice size |A|. There is only an implicit dependence on 
|A| (and (3) through the value of m 2 r , where — m 2 r denotes the lowest eigenvalue of —A or —J/) 2 . 
The dimension d enters in the scaling relation for relaxation times only through the constant 
of proportionality. Therefore one continues to have CSD on a lattice of only 2 d sites, and it 
seems necessary to go to a l d lattice in order to eliminate the appearance of CSD. 

When we update on a l d sublattice, we make the replacement 

( z ) ( z ) + A(z) (0-1) . (4) 

Here A denotes a kernel which interpolates directly from a l d sublattice to A. ft — 11 is the error 
of 0^ represented at the last site. In the MG context, ft - 1 = (C*, DA) l (C*, f - D(f>^), 
where C* is the adjoint of the restriction operator which averages to a l d layer. From the 
variational point of view (VPV) an algorithm is set up in such a way that the functional 



2 



K[(j)\ = ~ < cj), D<p > — < 0, f > = N c 1 Tr -D0) — {(f), /)] is lowered as far as possible in 

every iteration. This leads to C* = A. From the VPV the optimal A in @ equals <j)^ n \ Thus, 
we obtain Eq. (0). Considered from the VPV alone without thinking of MG, one obtains (fj) 
by rescaling (j)^ with a matrix Q that is determined such that K[(j)^Q] is as low as possible. 

In case of staggered fermions we have to consider that there are 2 d different pseudoflavors 0. 
In the limiting case of a pure gauge the fermionic problem amounts to computing 2 d decoupled 
bosonic propagators. Hence for staggered fermions we replace (H) by 

0W(z)^0W(z)O(^(z)) , (5) 

where H(z) denotes the pseudoflavor of z. Now the expression for Q(H) is more complicated 
than that given in (fj). In practice, we determine the f2(if)'s by solving one linear N^2 d x N^2 d 
system, or - making use of the independence of the even and odd sublattices - by solving two 
systems of a quarter of that size. 

2.2. Related improvements from a VPV 

Some related modifications of (MG) relaxation algorithms will be discussed now. The new 
parameters are not tunable, they are all determined by the algorithms themselves. 

2.2.1. Modified MG correction updating step. In conventional MG approaches one considers 
updates of the form 0^ 1— > < - n ) +(p^ where ^ is obtained by interpolation of an approximate 
solution of a residual equation on a coarser lattice. We propose to generalize this to 

(n) {z) 1 — ► (n) = (n) {z) n + <^ (n) (z) e . (6) 

The two N c x N c matrices f2 and G are chosen such that .ff [0( n )] is minimized. In particular, 
this proposal may be an improvement in algorithms where the residual equation is only solved 
approximately, or in algorithms were coarse grid operators are not defined through the Galerkin 
prescription. For staggered fermions Q and in @ become pseudoflavor dependent. 

2.2.2. Modified checkerboard SOR. Consider for illustration the bosonic problem. When we 
update at the even sites, we propose to modify SOR according to 

(n) (z) ^ <p^\z) n + ^ n \z) 6 if z is even , 

(n) (z) ^ (n) (z) E if z is odd , (7) 

where <pW(z) = {2d + m 2 )~ l [f{z) +E z 'n.n.zU(z,z r )^ n \z% and U(z, z') is the gauge field on 
the link (z,z'). Again, the matrices Q, G and 2 should be chosen such that the functional K 
gets minimized. The proposal (|7D expresses the view that in gauge theories one should have 
relaxation matrices rather than relaxation parameters that are real numbers. 

2.2.3. Modified damped Jacobi relaxation. Damped Jacobi relaxation can be generalized accord- 
ing to ([]) with ip( n > = (2d + m 2 ) -1 r^ n \ If one fixes = 1, one recovers the MR algorithm that 
was used by Hulsebos et al. P, f| 

Finally we note that the proposals (0), (|5[), (|B|) and (□) respect gauge covariance. Iterative 
algorithms are gauge covariant in the sense that all <fffi are gauge transformed by g if g is 
applied before relaxation is started. Q is gauge invariant (or transforms like a matter field 
sitting at site w in the adjoint representation, i. e. O 1— > g w VLg~ l , when / — > 8 ZjW ) etc. 
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3. Results for propagators in 4-dimensional SU(2) gauge fields 

It is well known || that the convergence rate of iterative algorithms for propagators is 
governed by the condition number^] of (—A + m 2 ) or (— jp 2 + m 2 ). Therefore the CSD scaling 
relation for relaxation times r reads 

r oc (Am 2 ) -2 / 2 for small Am 2 = m 2 — m 2 r , (8) 

where z denotes the critical exponent, r is defined by the asymptotic exponential decay of 
the norm of the residual. We measure all r's and number of iterations in units of cycles 
which involve only one sweep through the finest lattice. The variational MG method used for 
solving Eq. (HD is described in some detail in Refs. |7|, [U], |TTJ. Its implementation is actually 
a twogrid algorithm where the residual equation is solved exactly by CG. A scale factor of 3 
is chosen in blocking. We use the gauge covariant ground-state projection MG method. An 
efficient algorithm for computing averaging kernels C was described in Ref. 1[L2|| , and was used 
in this work. The (non-optimized) implementations of the MG programs require a factor of 
3.2/7.8 (bosons/staggered fermions) more arithmetic operations than CG when updating on a 
l d sublattice is included. Without that inclusion the factor is 2.1/4.5. The amount of work for 
C on the whole lattice is equivalent to less than 20 CG iterations. 

3.1. Propagators in pure gauges 

In pure gauges the bosonic and fermionic problems are equivalent. There MG does a perfect 
job, CSD is completely eliminated with short r's and z = 0, Ref. £f(|. Rescaling @ does not 



improve the performance any further. The improved correction scheme ([|), however, is able to 
halve r and to bring it close to 1. Combining (0) with conventional 1-grid relaxation causes 
CSD (in the sense stated above) to disappear. This is shown in Fig. 1. The norm of the residual 
is not monotonically decreased for small m 2 . This feature also shows up in CG which minimizes 
K (i. e. the residual r in the norm induced by the scalar product < ■ , D^ 1 ■ >) rather than 
||r|| itself. This is an interesting point: Instead of (0), one could think of rescaling (j)^ by 
another matrix Q' which is chosen such that ||r( n )|| is minimized. However, in this case there 
is no difference to 1-grid relaxation. This remains true in nontrivial gauge fields. Finally we 
note that practically Q = 1 as soon as ||r|| decays exponentially. Then the step (|2|) could be 
switched off. This statement holds also in nontrivial gauge fields and for MG. 

3.2. Bosonic propagators in nontrivial gauge fields 

The remarks made above about 1-grid relaxation plus (0) apply in nontrivial gauge fields 
as well. Fig. 1 looks the same when m 2 is replaced by Am 2 . MG plus (@) beats CSD, too. 
Compared with CG and 1-grid plus (|2J), this modified MG performs better the more critical 
the system is. Convergence for Am 2 = 10 -6 on 12 4 and 18 4 lattices is shown in Fig. 2.0 It 
must be emphasized that f2 really needs to be a matrix. Simple rescaling of (f)^ with a real 
number does not succeed in eliminating CSD.0 The improved MG correction scheme @ also 
beats CSD. However, at finite f3 it is not able to halve r's. Using the modified SOR version (0) 
does not pay. If one fixes S = 11, one has nothing else but conventional Gauss-Seidel relaxation. 
Retaining S yields little difference in performance. 

3.3. Propagators of staggered fermions in nontrivial gauge fields 

1 H. e. the ratio of the largest to the smallest eigenvalue 

2 ) Using SOR as a smoother contradicts the conventional MG wisdom. However, in nontrivial gauge helds 
any over-relaxation yields better performance than Gauss-Seidel and much better performance than damped 
Jacobi or MR. A similar result was reported in a recent paper on MG gauge fixing 

3 -*A remark about 1-grid plus @ is in order here: It can be proved by induction that in SU(2) gauge helds 
il oc 11 always holds if one starts relaxation with c/A°) = 0. However, Q oc 11 is not true in general. 
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In case of staggered fermions m 2 r is much closer to zero than in case of bosons. Actually, 
it is often assumed that the finite value of m 2 r can be neglected completely in (f|) so that 
r oc m~ z . However, this neglect of m 2 r is not justified on lattices amenable in size to date. 
For bosons, scaling (§) without violations is observed for Am 2 ^,0.01, Ref. JTOf. For staggered 



fermions it was verified that the scaling relation (§) also holds. The critical exponent z equals 
2 for conventional relaxation (fixed u) and for MG without (|5]). It was also verified that the 
constant of proportionality in (|8p is independent of the lattice size. This statement is true for 
1-grid as well as for MG algorithms. To obtain these results requires a careful analysis of data. 
The subtle point is that for fermions (|8p is obeyed only for Am 2 <,0.001, and asymptotic decay 
in the sense that only the slowest mode governs convergence does not set in before 400 - 500 
iterations in 4-dimensional SU{2) gauge fields. 

The lesson is that m 2 r must not be neglected, at least for lattices up to 18 4 . In practice 
this might mean that on such relatively small lattices m 2 must also be given small negative 
values in order to study effects of CSD. This is of course artificial, but it is necessary when one 
wants to obtain results for z which are reliable for predictions how algorithms perform on large 
lattices. If one does not investigate systems close enough to criticality and if one does not take 
care that decay rates are really asymptotic, there is the danger of extracting wrong values for 
z, even in case of pure gauges. 

If one is interested in the question how well algorithms perform on lattices of sizes that 
are used in present day simulations, the foregoing discussion might appear too academic. It 
might appear more natural to ask how many iterations are needed to obtain a given accuracy. 
From this viewpoint it turned out that MG is competitive or even superior to conventional 
algorithms in 2-d models 0, ||. However, it is more difficult to reach decisive conclusions in 
d = 4. Preliminary results on 16 4 || and 18 4 [|14| lattices (both at (3 = 2.7) indicated that the 



MG methods tested so far will not be able to outperform CG. But we expect that the situation 



will be different on larger lattices. Details will be reported elsewhere [15 



We carry on with results of the modifications proposed in this article. Table 1 gives a survey 
of convergence on 12 4 and 18 4 lattices. Rescaling (|5]) does not pay for positive m 2 on small 
lattices. But including ([5]) in algorithms brings z down to zero, i. e. CSD is eliminated. An 
analogue of Fig. 2 for staggered fermions is given with Fig. 3. Mind, however, that relaxation 
algorithms for fermions are used with lexicographic, not checkerboard, updating. 

We conclude this section with remarks on MR. Since MR can be viewed as an optimized 
Jacobi relaxation, it comes as no surprise that its convergence properties are worse than those 
of SOR. Working with pseudoflavor dependent matrices leads to no practical improvement. 
Over- relaxed MR versions have not been investigated yet. 

3.4. Cautionary Remark 

To the author's knowledge there exists no study in the literature where m 2 r is not disregarded 
in case of staggered fermions. This neglect is only justified by the smallness of m 2 r but it has 
never been checked whether the neglect is justified. A result of the present work is the validity 
of the relation r = const/ Am 2 in 1-grid and variational MG relaxation, with a constant which 
is independent of the lattice size. Therefore the study of the asymptotic behavior of r when the 
linear extension of the lattice and I /Am are changed proportionally, can be determined from 
studies at fixed volume. Elimination of the 1/Am 2 divergence on a lattice of fixed size implies 
the absence of CSD in computations where all quantities are scaled appropriately. Certainly, 
in physical applications (Monte Carlo simulations) the inverse mass should be smaller than the 
extension of the lattice, but this is an aspect of finite size effects on physical observables. 
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By neglecting m c 2 r and trying to determine z under scaling conditions, one can at most 
obtain some effective Zgff. This Zgff contains however a great deal of arbitrariness and cannot 
be defined uniquely. One can run into difficulties with this procedure H. The author admits 
that a Zeff is of more practical relevance as long as numerical simulations are limited to lattice 
sizes where m is not really small. But we look for algorithms which can be used in future 
large scale computations, and for these it will be z and not z^g which governs CSD. There 
is one weak point in this reasoning, and that is the remaining volume effect of (j^). z = is 
valid asymptotically, but it takes longer to reach the asymptotic regime the larger the lattice 
becomes. Therefore one might have to go back to a z^ff, but this is an open question. 

4. Conclusions 

Updating on a last l d MG layer provides an astonishingly simple modification which elim- 
inates CSD of asymptotic relaxation times in MG and even in 1-grid relaxation algorithms for 
propagators. As soon as the decay of the error is exponential, updating on the l d sublattice 
can be switched off. Therefore additional work must only be invested in the initial and in an 
intermediate stage of computations. Since the modified algorithms have z = 0, we expect that 
CG will eventually be outperformed. What remains, however, is a volume dependence on how 
fast the asymptotic regime is reached. For bosons it was shown that CG is outperformed. But 
we feel unable to predict whether the same methods will pay for staggered fermions on lattices 
of realizable sizes. 
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Table 1 Convergence in computations of propagators of staggered fermions with f{z) = 6 Z! q in 4-d 
SU(2) gauge fields. Given is the number of iterations necessary for reducing ln||r^|| by 10. All 
propagators are initialized with zero. 1-grid and MG SOR are swept in lexicographic ordering. 



pure gauge configurations {j3 = oo) 


algorithm and lattice size 


10" 1 


10" 2 


m 
10" 3 


10" 4 


10" 5 


10" 6 


CG on 12 4 


17 


17 


17 


17 


17 


17 


CG on 18 4 


28 


30 


33 


35 


37 


39 


1-grid SOR plus (g), u = 1.90, on 12 4 


105 


125 


155 


195 


230 


270 


1-grid SOR plus (g), u = 1.90, on 18 4 


110 


125 


155 


195 


230 


270 


MG SOR without (|5 


),u = 1.09, on 12 4 


21 


23 


23 


23 


23 


23 


MG SOR without (|5 


),u = 1.09, on 18 4 


20 


23 


23 


23 


23 


23 


nontrivial gauge fields {(3 = 2.7) 


algorithm and lattice size 


10" 1 


lO" 2 


Arr 
10" 3 


a = 
10~ 4 


10" 5 


10" 6 


CG on 12 4 


65 


175 


265 


300 


320 


340 


CG on 18 4 


65 


180 


350 


415 


455 


495 


1-grid SOR plus (g), u = 1.90, on 12 4 


100 


130 


530 


560 


630 


710 


1-grid SOR plus (@), uj = 1.90, on 18 4 


95 


125 


710 


1090 


1320 


1540 


MG SOR plus (D, u) = 1.96, on 12 4 


180 


185 


385 


425 


475 


535 


MG SOR plus (|), lo = 1.96, on 18 4 


180 


185 


530 


775 


950 


1070 



Figure captions 

Fig. 1 1-grid SOR plus @ eliminates CSD, shown here for a pure gauge, (u = 1.90, checkerboard 
updating) The 6 curves correspond to m 2 = 0.1,0.01, . . . , 10 -6 on a 12 4 (non-staggered) lattice with 
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m decreasing from left to right. A small volume effect remains, not for r (i. e. the asymptotic decay 
rate) but with respect to how fast the asymptotic regime is reached: e. g. on an 18 4 lattice the number 
of iterations needed to obtain a given accuracy of ||r|| for m 2 = 10 -6 is increased by ~ 20. The figure 
looks the same for bosons in nontrivial gauge fields when m? is replaced by Am 2 . 

Fig. 2 Convergence for bosonic propagators with Am 2 = 10~ 6 in quenched A-d SU{2) gauge fields 
equilibrated with Wilson's action at [3 = 2.7. The numbers refer to the following algorithms: 1/2: 
variational MG SOR (to = 1.50) plus © on a 12 4 /18 4 lattice; 3/4: 1-grid SOR (to = 1.91) plus (|) 
on a 12 4 /18 4 lattice; 5/6: CG on a 12 4 /18 4 lattice. Relaxation algorithms are swept in checkerboard 
fashion. The critical masses are m 2 r = -0.7726281/-0.7554339. Without (g) MG SOR and 1-grid 
SOR haver's of O(10 5 ) 0. 

Fig. 3 Convergence for propagators of staggered fermions with Am 2 = 10 -6 in quenched 4-d SU (2) 
gauge fields at (3 = 2.7. The numbers refer to the following algorithms: 1/2: variational MG SOR 
(u = 1.96) plus (D on a 12 4 /18 4 lattice; 3/4: 1-grid SOR (w = 1.90) plus (|) on a 12 4 /18 4 lattice; 
5/6: CG on a 12 4 /18 4 lattice. Relaxation algorithms are swept in lexicographic ordering. The critical 
masses are m 2 r = -0.0368447/-0.0096640. Without (|) MG SOR and 1-grid SOR have r's of O(10 5 ). 
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